Abstract: With the increasing popularity of Cloud computing 
and Mobile computing, individuals, enterprises and research 
centers have started outsourcing their IT and computational 
needs to on-demand cloud services. Recently geographical load 
balancing techniques have been suggested for data centers hosting 
cloud computation in order to reduce energy cost by exploiting 
the electricity price differences across regions. However, these 
algorithms do not draw distinction among diverse requirements 
for responsiveness across various workloads. In this paper, we use 
the flexibility from the Service Level Agreements (SLAs) to dif- 
ferentiate among workloads under bounded latency requirements 
and propose a novel approach for cost savings for geographical 
load balancing. We investigate how much workload should be executed 
in each data center and how much should be delayed and 
migrated to other data centers to save energy while meeting 
deadlines. We present an offline formulation for the geographical 
load balancing problem with dynamic deferral and give online 
algorithms to determine the assignment of workload to the data 
centers and the migration of workload between data centers 
in order to adapt to dynamic electricity price changes. We 
compare our algorithms with the greedy approach and show that 
significant cost savings can be achieved by migration of workload 
and dynamic deferral with future electricity price prediction. We 
validate our algorithms on MapReduce traces and show that 
geographic load balancing with dynamic deferral can provide 
20-30% cost-savings. 

Index Terms: Cloud Computing; Data Center; Deadline. 

I. Introduction 

Rising energy prices, together with the growth of cloud 
computing, make the energy efficiency of clouds a pressing 
issue: according to an EPA report, servers and data centers 
consumed 61 billion kilowatt-hours at a cost of $4.5 billion 
[1]. Moreover, the ability to dynamically track electricity price 
variations, enabled by enhancements to the electrical grid, raises 
the possibility of using cloud computing for energy-efficient 
computing. Recently there has been considerable exploration 
of this topic, searching for opportunities to reduce energy 
consumption in the context of the cloud [2], [3], [4], [5], [6], [7]. 
While there are a number of hardware and software techniques 
for energy savings considering different aspects, one non- 
conventional perspective is to utilize the predetermined service 
level agreements (SLAs) for energy efficiency. Often there is 
flexibility in the specification of SLAs and the system could 
use that flexibility to improve the performance and efficiency 
[8], [9]. Specifically, latency is an important performance 
metric for any web-based services and is of great interest 
to service providers who are responsible for services on the 



cloud. The goal of this paper is to utilize the delay or latency 
requirements to make cloud computing more energy efficient. 

Naturally, energy efficiency in the cloud has been pursued 
in various ways including the use of renewable energy LIOJ , 
ifTTIl . Il73l . Ifljl and improved scheduling algorithms E), Q, 
lfT2ll . ifTSl . etc. Among them, improved scheduling algorithm 
is a promising approach for its broad applicability regardless 
of hardware configurations. The idea of utilizing SLA infor- 
mation to improve performance and efficiency is not entirely 
new. Recent work explores utilization of application deadline 
information for improving the performance of the applications 
(e.g., see [8], [9]). But the opportunities for energy efficiency 
remain unexplored. In this paper, we utilize the flexibility from 
the Service Level Agreements (SLAs) for different types of 
workload to reduce energy consumption. 

We consider the problem of geographical load balancing in 
the cloud (Figure 1). In cloud computing, the centers of 
execution (data centers) are usually located in different geographic 
locations, which are often in different time zones. Due to the 
increase in cost of energy, the electric billing companies have 
different pricing rates for electricity at different locations and 
at different times of the day. Hence load balancing decisions 
should take into account the current time zones and locations 
of data centers during task assignment for minimizing the total 
cost of energy consumption in the cloud. In this paper, we 
investigate and analyze how the pricing of energy at different 
times of the day along with 'task migration' account for the 
decisions for task assignment and deferral in the cloud. We 
use deadline information to defer some tasks so that we can 
reduce the total cost for energy consumption for executing the 
workload depending on time and location. 

The contribution of this paper is twofold. First, we present 
a simple but general model for geographical load balancing 
and provide an offline formulation for solving the problem 
with deadline requirements. For each time slot, the formulation 
determines the assignment of workload to data centers and the 
migration of workload between data centers to adapt to the 
dynamic electricity price variation. 

Second, we design an online algorithm for geographical 
load balancing considering migration and prediction error. The 
algorithm uses migration to improve the performance in case 
of prediction errors. We show that no online algorithm has 
constant competitive ratio with respect to the offline algorithm 
because of the uncertainty in electricity price variation. This 
allows us to compare our online algorithm with a simpler 




Fig. 1. Geographical Load Balancing. 



online algorithm without migration and prediction error and 
to determine a bound on the cost based on the prediction 
error. We then prove that an online algorithm with migration 
gives better cost savings than the online algorithm without 
migration, with future electricity price prediction. We validated 
our model by experiments using MapReduce traces as dynamic 
workload and found 20-30% total cost savings. 

The rest of the paper is organized as follows. Section II 
presents the model that we use to formulate the optimization 
and gives the offline formulation. In Section III, we present 
the online algorithm for determining workload assignment and 
migration dynamically for uniform and nonuniform deadline. 
Section IV shows the experimental results. In Section V, we 
describe the state of the art research related to geographical 
load balancing and Section VI concludes the paper. 

II. Model Formulation 

In this section, we describe the model we use for geograph- 
ical load balancing via dynamic deferral. The assumptions 
used in this model are minimal and this formulation captures 
many properties of current geographical load balancing and 
workload characteristics. 

A. Workload Model 

We consider a workload model where the total workload 
varies over time. The time interval we are interested in is 
$t \in \{0, 1, \ldots, T\}$, where $T$ can be arbitrarily large. In practice, 
$T$ can be a year and the length of a time slot $\tau$ could be as 
small as milliseconds for service requests (e.g. HTTP) or as 
large as several minutes for batch-like jobs (e.g. MapReduce). 
A basic assumption of our model is that energy (electricity) 
costs may vary in time, yet remain fixed within a time slot of 
length $\tau$. To facilitate the future price prediction, we denote the 
set of the time slots in a 24-hour time frame by 
$\mathcal{K} \subset \{0, 1, \ldots, T\}$. In our model, the jobs have length less 
than $\tau$ and each job has a deadline $D$ (in terms of number of 
slots) associated with it, within which it needs to be executed, 
where $D$ is a nonnegative integer. The value of $D$ can be zero for 
interactive jobs and large for batch-like jobs. If the length $\ell$ of 
a job is greater than $\tau$, then we can safely decompose it into 
small pieces ($\le \tau$), each of which is released after the execution 
of the preceding piece. If the job is preemptive, then we assign 
deadline $\lfloor D/\ell \rfloor - 1$ to each of the pieces; otherwise, for a 
non-preemptive job, we assign a deadline of $D - \ell$ to the first 
piece and deadlines of zero to the other pieces. Thus large jobs 
are decomposed into small jobs. Hence we do not distinguish 
individual jobs; rather, we deal with the total amount of workload. 
First we consider the case of uniform deadlines, that is, the 
deadline is the same for all workloads, followed by the non-uniform 
deadline case in Section III-E. Let $L_t$ be the amount of workload 
released at time slot $t$. This amount of work must be executed by 
the end of time slot $t + D$. 

In our model, we consider a large computing facility 
("cloud"), consisting of $n$ data centers. At each time $t$, the 
total workload $L_t$ arrives at a central dispatcher, where the 
load balancing decisions are made. We assume that the arriving 
workload cannot be stored at the dispatcher, i.e., the workload 
arriving at the beginning of time $t$ needs to be dispatched to 
the data centers as soon as the assignment of workload for each 
data center is determined. After the load balancing decisions 
are made at the dispatcher, the jobs can be stored at each data 
center to be executed at a suitable time before the deadline. We 
are not concerned with the computation capability (homogeneous 
or heterogeneous) inside each data center; rather, we focus 
on load distribution, treating data centers as computation 
units. The total computation capacity $M_i$ of data center $i$ is 
fixed and given for $1 \le i \le n$. We normalize $L_t$ by the 
processing capability of the data centers, i.e., $L_t$ denotes the 
computation units required to execute the workload at time $t$. 

Let $x_{i,d,t}$ be the portion of the released workload $L_t$ that is 
assigned to be executed at data center $i$ at time slot $t + d$. Let 
$x_{i,t}$ be the total workload assigned to be executed at time $t$ at 
data center $i$ and $x_t$ be the total assignment at time $t$. Then 
$0 \le x_{i,t} \le M_i$ and 

\[ \sum_{d=0}^{D} x_{i,d,t-d} = x_{i,t} \quad \text{and} \quad \sum_{i=1}^{n} x_{i,t} = x_t. \]

We assume that the energy prices vary unpredictably 
depending on time and location. The workload assigned to one 
data center can be migrated to other data centers in order 
to reduce the total energy cost. Let $z_{i,j,d,t}$ be the 
amount of workload that is migrated at time $t$ from data 
center $i$ to be executed at data center $j$ at time $t + d$. Then 
$z_{i,j,t} = \sum_{d=0}^{D} z_{i,j,d,t}$ is the total amount of workload that is 
migrated from data center $i$ to $j$ at time $t$. We assume that there 
is a cost associated with each migration and that all migrations are 
done as soon as the migration decisions are made. We also 
assume that migration time is negligible with regard to the time 
interval $\tau$, i.e., migration does not incur any delay. For service 
requests, $\tau$ is small but the migration time is also negligible, 
and for batch-like jobs, $\tau$ is in the range of minutes 
while migration time is in the range of seconds. Therefore, with 
respect to $\tau$, migration time is small (negligible). 

Since some portion of the assigned workload is migrated to 
other data centers, the workload that is executed at time $t$ at 
data center $i$ is the sum of the assigned workload and the net 
migrated-in workload, as denoted by 

\[ y_{i,t} = x_{i,t} + \sum_{j=1}^{n} \sum_{d=0}^{D} z_{j,i,d,t-d} - \sum_{j=1}^{n} \sum_{d=0}^{D} z_{i,j,d,t-d} \tag{1} \]
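
To make the bookkeeping in equation (1) concrete, the following Python sketch computes the executed workload from an assignment and a set of migrations. The function name `executed_workload`, the zero-based indices and the dictionary layout of `z` are illustrative assumptions, not part of the model.

```python
def executed_workload(x, z, i, t, D):
    """Executed workload y[i][t] following equation (1).

    x[i][t]         -- workload assigned to data center i for slot t
    z[(i, j, d, s)] -- workload migrated at slot s from data center i,
                       to be executed at data center j in slot s + d
    Missing keys in z are treated as zero migration (an assumption).
    """
    n = len(x)
    # Work that other data centers sent here, due to run in slot t.
    moved_in = sum(z.get((j, i, d, t - d), 0.0)
                   for j in range(n) for d in range(D + 1))
    # Work that was due to run here in slot t but was sent elsewhere.
    moved_out = sum(z.get((i, j, d, t - d), 0.0)
                    for j in range(n) for d in range(D + 1))
    return x[i][t] + moved_in - moved_out


# Two data centers, two slots, maximum deferral D = 1.
x = [[4.0, 2.0], [1.0, 3.0]]
z = {(0, 1, 1, 0): 1.5}  # at slot 0, move 1.5 units from DC 0 to DC 1 for slot 1
y = [[executed_workload(x, z, i, t, 1) for t in range(2)] for i in range(2)]
```

In this small instance the total executed workload equals the total assigned workload, as migration only relocates work.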

Since the workload released within $[1, T]$ needs to be 
finished within $T$ time slots, the total assignment and execution 
are equal to the total released workload over $T$ time slots, as 
given by the following equation. 

\[ \sum_{t=1}^{T} \sum_{i=1}^{n} x_{i,t} = \sum_{t=1}^{T} \sum_{i=1}^{n} y_{i,t} = \sum_{t=1}^{T} L_t \tag{2} \]

Thus there are two important decisions here: (i) determining 
$x_{i,d,t}$, the assignment of workload to the data centers, and (ii) 
determining $z_{i,j,d,t}$, the amount of migrated workload during 
each time slot $t$. 

B. Cost Model 

The goal of this paper is to minimize the operation cost in 
the cloud which is the sum of the energy costs for executing 
workload at the data centers and the costs for migrating jobs 
between data centers. 

Energy cost: To capture the geographic diversity and 
variation of energy costs over time, we let $C_{i,t}(y_{i,t})$ denote 
the energy cost for executing workload $y_{i,t}$ in data center $i$ 
at time slot $t$. We assume that $C_{i,t}(y_{i,t})$ is a nonnegative, 
(weakly) convex increasing function as used in [7]. Note that 
the function itself can change with time, which allows for 
time variation in energy prices. The simplest example of a 
cost function for a time slot is an affine function, which is the 
common model for the energy cost of typical data centers: 

\[ C_{i,t}(y_{i,t}) = \alpha_i + \beta_{i,t}\, y_{i,t} \]

where $\alpha_i$ and $\beta_{i,t}$ are constants for data center $i$ and time 
slot $t$ (e.g. see [4]) and $y_{i,t}$ is the workload executed at data 
center $i$ at time $t$. Note that in this model, only the load-dependent 
component of the function depends on time, because the 
real-time electricity price varies with real-time demand. 

Migration cost: The migration cost is the cost for migrating 
workload from one data center to another, which accounts 
for bandwidth cost and energy consumed by the intermediate 
devices. The migration cost is proportional to the amount of 
migrated workload and is represented by 

\[ b_{i,j}\, z_{i,j,t} \]

where $b_{i,j}$ is a constant for migration from data center $i$ to $j$ 
regardless of the migration time. Note that there are also 
bandwidth costs associated with the arrival of jobs into the 
cloud and their departure from the cloud. These costs can be easily 
incorporated without changing the problem formulation, as 
they are constant and do not depend on the migration control. 
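
As a small illustration of the cost model, the sketch below evaluates the affine energy cost and the proportional migration cost and adds them up over all data centers and slots. The function names and the nested-list layout (`beta[i][t]`, `z[i][j][t]`) are assumptions made for this example only.

```python
def energy_cost(alpha_i, beta_it, y_it):
    """Affine energy cost: fixed part plus load-proportional part."""
    return alpha_i + beta_it * y_it


def total_cost(alpha, beta, y, b, z):
    """Sum of energy and migration costs over all data centers and slots.

    alpha[i]   -- load-independent cost of data center i per slot
    beta[i][t] -- electricity price at data center i in slot t
    y[i][t]    -- workload executed at data center i in slot t
    b[i][j]    -- migration cost per unit of workload from i to j
    z[i][j][t] -- total workload migrated from i to j in slot t
    """
    n, T = len(y), len(y[0])
    energy = sum(energy_cost(alpha[i], beta[i][t], y[i][t])
                 for i in range(n) for t in range(T))
    migration = sum(b[i][j] * z[i][j][t]
                    for i in range(n) for j in range(n) for t in range(T))
    return energy + migration


alpha = [1.0, 2.0]
beta = [[0.5, 1.0], [0.2, 0.4]]
y = [[4.0, 0.5], [1.0, 4.5]]
b = [[0.0, 0.1], [0.1, 0.0]]
z = [[[0.0, 0.0], [1.5, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
cost = total_cost(alpha, beta, y, b, z)  # energy 10.5 + migration 0.15
```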



C. Optimization Problem 

Given the models above, the goal of geographical load 
balancing is to choose the migrated workload $z_{i,j,d,t}$ and the 
dispatching rule $x_{i,d,t}$ to minimize the total cost during $[1, T]$, 
which is captured by optimization (3). In this formulation, 
constraint (3b) states that the total assignment should 
be equal to the total released workload and constraint (3c) 
states that the total workload that is migrated from data 
center $i$ to be executed at time $t$ cannot exceed the workload 
assigned to be executed at time $t$ at data center $i$. We now 
prove that there is an optimal solution of optimization (3) 
where there is no migration. We have the following lemma. 

Lemma 1: In every optimal solution of optimization (3), 
either $z_{i,j,d,t} = 0$ for all time slots $t$ and deferrals $d$, or $b_{i,j} = 0$, 
for all $(i, j)$. 

Proof: Suppose for a contradiction that the optimal 
solution $O$ contains $z_{i,j,d,t} > 0$ and $b_{i,j} > 0$ for 
some $(i, j, d, t)$. Then we can construct another feasible 
solution $O'$ where $z'_{i,j,d,t} = 0$, $\forall i, j, d, t$, by making 
$x'_{i,t} = y_{i,t}$. Then $y'_{i,t} = y_{i,t}$, $\forall i, t$, and 

\[ \sum_{t=1}^{T} \sum_{i=1}^{n} C_{i,t}(y'_{i,t}) = \sum_{t=1}^{T} \sum_{i=1}^{n} C_{i,t}(y_{i,t}). \]

The objective values of the solutions satisfy $obj(O') < obj(O)$ because 

\[ \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} b_{i,j}\, z'_{i,j,t} = 0 < \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} b_{i,j}\, z_{i,j,t} \]

since $b_{i,j} > 0$. This contradicts the 
assumption that $O$ is an optimal solution. ■ 

Corollary 2: There exists an optimal solution of 
optimization (3) where $z_{i,j,d,t} = 0$, $\forall i, j, d, t$. 

By Corollary 2, migration is unnecessary when all the 
information about workload and energy prices is known in 
advance. However, when not all of this information is available, 
migration becomes important, as investigated in the next 
section. 

Since the operating cost $C_{i,t}(\cdot)$ is an affine function, the 
objective function is linear, as are the constraints. Hence 
the optimization (3) is a linear program. Note that 
the workload in the formulation is not required to be 
integer. This is acceptable because the number of requests in 
data centers at each time slot is in the range of thousands 
and we can round the resulting assignment with minimal 
increase in cost. If all the future costs and workload were 
known in advance, then the problem could be optimally solved 
as a linear program. However, our basic assumption is that 
electricity prices change in an unpredictable manner depending 
on time and location. Therefore, we tackle the optimization 
problem as an online optimization problem. 

\[ \min_{x_{i,d,t},\, z_{i,j,d,t}} \; \sum_{t=1}^{T} \sum_{i=1}^{n} C_{i,t}(y_{i,t}) + \sum_{t=1}^{T} \sum_{i=1}^{n} \sum_{j=1}^{n} b_{i,j}\, z_{i,j,t} \tag{3a} \]
\[ \text{subject to} \quad \sum_{i=1}^{n} \sum_{d=0}^{D} x_{i,d,t} = L_t \quad \forall t \tag{3b} \]
\[ \sum_{k=0}^{D-d} \sum_{j=1}^{n} z_{i,j,d+k,t-k} \le \sum_{k=0}^{D-d} x_{i,d+k,t-k} \quad \forall i, \forall d, \forall t \tag{3c} \]
\[ y_{i,t} \le M_i \quad \forall i, \forall t \tag{3d} \]
\[ x_{i,d,t} \ge 0, \quad z_{i,j,d,t} \ge 0 \quad \forall i, \forall j, \forall d, \forall t. \tag{3e} \]

III. Online Algorithm 

In this section we consider the online case, where at any 
time $t$ we have neither information about the future workload 
$L_{t'}$ for $t' > t$ nor knowledge of future electricity 
prices. The workload released at time $t$ can be delayed and 
executed in future time slots if the cost of execution in 
those slots is less than the current cost. We apply optimization 
to the current and delayed workload and distribute 
it over future time slots so that the total cost of execution 
and migration is minimized subject to the future predicted 
prices. In the online algorithm, we decouple the migration 
decision from the assignment decision and apply optimization 
at two levels: (i) the dispatcher level and (ii) the data center level. 
The dispatcher decides on the assignment of the 
incoming workload to data centers based on predicted future 
electricity prices. Then the data centers decide how to 
adjust the execution of the workload in current and future 
time slots and how to migrate workload between data centers 
in case of prediction errors. 

A. Electricity Price Prediction Model 

In this section we illustrate our model for predicting the future 
price of electricity. Since only the load-proportional component 
of the energy cost depends on time, we need to predict $\beta_{i,t}$. Then 
the predicted energy cost function will be 

\[ \hat{C}_{i,t}(y_{i,t}) = \alpha_i + \hat{\beta}_{i,t}\, y_{i,t} \]

The load-dependent electricity price $\beta_{i,t}$ is announced by 
the utility at location $i$ at the beginning of each time slot 
$t$ and is kept constant for the duration of that time slot. 
However, future prices will change independently of past 
prices according to some known probability density function. 
Predicting electricity prices is difficult because price series 
exhibit nonconstant mean and variance 
and significant outliers. We model the prediction noise by a 
Gaussian random variable with zero mean and a variance to be 
estimated. In other words, we model future prices within a 
24-hour time frame by Gaussian random variables with known 
means, which are the predicted prices, and some estimated 
variance. The mean of the Gaussian distribution is predicted 
by the widely used moving average method for time series. 
The variance of the Gaussian distribution is estimated from 
the history by the weighted average price prediction filter 
proposed in [16]. In this model, variances are predicted by 
linear regression from the prices of yesterday, the 
day before yesterday and the same day last week. By using 
two different methods for mean and variance, we exploit both 
the temporal and historical correlation of electricity prices. 
Let $\mu_i^{\kappa}[x]$ and $\sigma_i^{\kappa}[x]$ be the predicted means and standard 
deviations for each time slot $\kappa$ on day $x$ for geographical 
location $i$. Then the mean of the prediction model for the Gaussian 
distribution is obtained as follows: 

\[ \mu_i^{\kappa} = \varepsilon_0 + \sum_{j=0} \varepsilon_{\kappa-j}\, \beta_{i,\kappa-j}, \quad \forall i \in \{1, \ldots, n\},\ \forall \kappa \in \mathcal{K}. \]

Here, $\varepsilon_j$ are the coefficients of the moving average method, 
which can be estimated by training the model over the previous 
day's prices. The variance parameter $\sigma_i^{\kappa}[x]$ is estimated from the 
history using the following equation: 

\[ \sigma_i^{\kappa}[x] = k_1 \sigma_i^{\kappa}[x-1] + k_2 \sigma_i^{\kappa}[x-2] + k_7 \sigma_i^{\kappa}[x-7] \]

Here, $\sigma_i^{\kappa}[x-1]$, $\sigma_i^{\kappa}[x-2]$ and $\sigma_i^{\kappa}[x-7]$ denote the 
previous standard deviation values $\sigma_i^{\kappa}$ on yesterday, the day 
before yesterday and the same day last week, respectively. The 
coefficients for the weighted average price prediction filter, $k_1$, 
$k_2$ and $k_7$, are selected from [16]. 
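
The two predictors described in this subsection can be sketched in a few lines of Python. This is only an illustration of the weighted three-day filter and of a moving-average mean: the function names, the list layouts and all numeric coefficients below are made-up assumptions, not the values used in the paper or in the cited filter.

```python
def predict_sigma(sigma_history, k1, k2, k7):
    """Weighted filter over yesterday, the day before, and a week ago.

    sigma_history holds one value per past day for a fixed slot and
    location, oldest first, so sigma_history[-1] is yesterday.
    """
    if len(sigma_history) < 7:
        raise ValueError("need at least seven days of history")
    return (k1 * sigma_history[-1]
            + k2 * sigma_history[-2]
            + k7 * sigma_history[-7])


def predict_mean(past_prices, eps0, eps):
    """Moving-average style mean: eps0 plus a weighted sum of recent prices.

    past_prices is ordered oldest first; eps[j] weights the price j slots
    before the most recent one.
    """
    if len(past_prices) < len(eps):
        raise ValueError("not enough past prices for the given weights")
    return eps0 + sum(w * past_prices[-1 - j] for j, w in enumerate(eps))


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
sigma_next = predict_sigma([7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0], 0.5, 0.3, 0.2)
mean_next = predict_mean([10.0, 20.0, 30.0], eps0=1.0, eps=[0.5, 0.25])
```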

B. Optimization for Dispatcher 

The dispatcher decides on the assignment of the 
workload to the data centers based on the current electricity 
prices and future price predictions. The following 
optimization, applied at the dispatcher, determines the assignment of 
workload $x_{i,d,t}$ to data centers for $0 \le d \le D$ and $1 \le i \le n$. 

\[ \min_{x_{i,d,t}} \; \sum_{i=1}^{n} \sum_{d=0}^{D} C_{i,t}(x_{i,d,t-d}) + \sum_{i=1}^{n} \sum_{k=1}^{D} \sum_{d=k}^{D} \hat{C}_{i,t+k}(x_{i,d,t+k-d}) \tag{4a} \]
\[ \text{subj. to} \quad \sum_{i=1}^{n} \sum_{d=0}^{D} x_{i,d,t} = L_t \tag{4b} \]
\[ 0 \le \sum_{d=s-t}^{D} x_{i,d,s-d} \le M_i \quad \forall i,\ t \le s \le t + D. \tag{4c} \]

where $\hat{C}_{i,t'}(\cdot)$ is the predicted cost function at time $t' > t$ 
for data center $i$ and $x_{i,d,t''}$ is the unexecuted workload at data 
center $i$ that was assigned at time $t'' < t$ to be executed at 
time $t'' + d$, where $t - t'' \le d \le D$. Note that a greedy method 
can also be applied to compute the optimum assignment for 
the dispatcher. 
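
One way such a greedy method could look is sketched below: the newly released workload is poured into the cheapest (data center, deferral) options first, subject to the remaining capacity of each option. The sketch assumes affine costs, ignores the load-independent term, and uses hypothetical names (`greedy_dispatch`, `unit_price`, `free_capacity`); it is not taken from the paper.

```python
def greedy_dispatch(load, unit_price, free_capacity):
    """Assign `load` units to the cheapest (data center, deferral) options.

    unit_price[i][d]    -- current (d = 0) or predicted (d > 0) price per
                           unit of workload at data center i in slot t + d
    free_capacity[i][d] -- capacity still unused at data center i in slot t + d
    Returns x with x[i][d] = workload sent to data center i for slot t + d.
    """
    x = [[0.0] * len(row) for row in unit_price]
    options = sorted((price, i, d)
                     for i, row in enumerate(unit_price)
                     for d, price in enumerate(row))
    remaining = load
    for _, i, d in options:
        if remaining <= 0:
            break
        take = min(remaining, free_capacity[i][d])
        x[i][d] = take
        remaining -= take
    if remaining > 1e-9:
        raise ValueError("not enough capacity before the deadline")
    return x


# Two data centers, deferral of 0 or 1 slot, 5 units of new workload.
plan = greedy_dispatch(5, [[3, 1], [2, 4]], [[2, 2], [2, 2]])
```

With a single source of workload and a linear price per option, filling options in order of increasing price is the usual fractional-knapsack argument; with general convex costs this shortcut would not necessarily apply.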



C. Optimization for Data Centers 

The predicted electricity prices at time $t$ may contain 
prediction errors, which may lead to a poor 
assignment. Data centers can migrate workload between each 
other to adjust the assignment, compensating for prediction errors 
and minimizing the total cost in later time slots. The 
adjustment is made by applying an optimization on the schedule 
for the unexecuted workload for the current and future time 
slots. For each data center i, this optimization makes decision 
on how much workload to execute at time t, how much to 
defer to execute later and how much to migrate to other 
data centers. Note that the workload released at or before 
t, cannot be delayed to be assigned after time slot t + D. 
Hence we minimize the total cost by applying optimization 
on the already released but unexecuted (delayed) workload 
over the interval [t,t + D]. We have two versions of the 
online optimization at data centers. First we formulate the 
optimization without considering migration. Then the more 
general case with migration is considered. 

1) Formulation without Migration: We start with a 
formulation for the online case by considering load balancing 
without migration. Although there is no migration, the 
data centers can still improve the assignment by executing 
delayed workload earlier, in preceding time slots, without violating 
deadlines, as shown by the curved arrow in Figure 2. Let $u_{i,t}$ 
and $w_{i,t}$ denote the assigned (delayed) and executed workload 
at time $t$ at data center $i$, respectively. Initially $u_{i,t} = w_{i,t} = 0$, 
$\forall t$. Then the values of $w_{i,s}$ for $t \le s \le t + D$ are obtained at 
each time $t$ by applying the optimization (5), and the values 
of $u_{i,s}$ for $t \le s \le t + D$ are updated at each time $t$ from 
the computed $w_{i,s}$. The following optimization determines the 
current and future execution variables $w_{i,s}$ for $t \le s \le t + D$. 

\[ \min_{w_{i,s}} \; \sum_{i=1}^{n} C_{i,t}(w_{i,t}) + \sum_{i=1}^{n} \sum_{s=t+1}^{t+D} \hat{C}_{i,s}(w_{i,s}) \tag{5a} \]
\[ \text{subj. to} \quad \sum_{s=t}^{t+D} w_{i,s} = \sum_{s=t}^{t+D} u_{i,s} \quad \forall i \tag{5b} \]
\[ \sum_{r=t}^{s} w_{i,r} \ge \sum_{r=t}^{s} u_{i,r} \quad \forall i,\ t \le s \le t + D \tag{5c} \]
\[ 0 \le w_{i,s} \le M_i \quad \forall i,\ t \le s \le t + D. \tag{5d} \]

Here the constraints (5b) and (5c) ensure that the assigned 
workload $u_{i,s}$ can only be moved to an earlier time slot 
$t \le t' \le s$ and thus does not violate the deadline. 
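
The "only move earlier" requirement can be checked mechanically. The sketch below encodes one reading of it for a single data center: the execution plan must carry the same total as the assigned plan, and at every slot it must have completed at least as much work as the assigned plan had scheduled up to that slot. The function name and the prefix-sum formulation are assumptions of this example; capacity limits are deliberately left out.

```python
from itertools import accumulate


def respects_deadlines(u, w, tol=1e-9):
    """Check that plan w only pulls work forward relative to plan u.

    u[k] -- workload assigned to slot t + k (k = 0..D) at one data center
    w[k] -- workload the data center intends to execute in slot t + k
    """
    if len(u) != len(w):
        return False
    if abs(sum(u) - sum(w)) > tol:
        return False  # work was lost or invented
    # Every prefix of w must cover the corresponding prefix of u.
    return all(done + tol >= due
               for due, done in zip(accumulate(u), accumulate(w)))


ok = respects_deadlines([1, 2, 3], [3, 2, 1])       # work moved earlier
late = respects_deadlines([1, 2, 3], [0, 3, 3])     # slot t work postponed
lost = respects_deadlines([1, 2, 3], [3, 3, 1])     # totals differ
```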

2) Formulation with Migration: We can utilize the 
migration of workload between data centers to correct the prediction 
errors (to some extent) made by the dispatcher at an earlier 
slot during dispatching. Let $z_{i,j,d,t}$ denote the migration of 
workload from data center $i$ at time $t$ which will be executed 
at data center $j$ at time $t + d$. Then the values of $u_{i,s}$ for 
$t \le s \le t + D$ are updated at each time $t$ from $w_{i,t}$, $z_{i,j,d,t}$ and 
$z_{j,i,d,t}$ with the rule in equation (6). 

Then, applying the optimization (7), the values for $w_{i,s}$ and 
$z_{i,j,d,s-d}$ are determined for $t \le s \le t + D$. Then the workload 



[Figure 2: timelines for Data Center 1, Data Center 2, ..., Data Center n over slots t, t+1, ..., t+D, showing the unexecuted workload.] 

Fig. 2. Optimization on unexecuted (assigned) workload at time t by 
transferring workload to previous time slots (curved arrow) and by migration 
to other data centers (straight arrow). 



that is executed, including the migration, at data center $i$ at 
time $s$ is $\eta_{i,s}$ for $t \le s \le t + D$, which is determined by the 
following equation 

\[ \eta_{i,s} = w_{i,s} + \sum_{j=1}^{n} \sum_{d=0}^{D} z_{j,i,d,s-d} - \sum_{j=1}^{n} \sum_{d=0}^{D} z_{i,j,d,s-d} \]

The optimization (7), applied at time $t$, determines the 
current and future execution variables $w_{i,s}$ for $t \le s \le t + D$ 
and the current migration $z_{i,j,d,t}$ for $0 \le d \le D$. 

Here the constraints (7b) and (7c) ensure that the assigned 
workload $u_{i,s}$ can be migrated to other data centers and can 
only move to an earlier time slot $t \le t' \le s$ and thus does not 
violate the deadline, as shown by the arrows in Figure 2. A further 
constraint of (7) ensures that the amount of migration does not exceed 
the unexecuted workload. Then the actual workload that is 
executed at time $t$ at data center $i$ is 

\[ y_{i,t} = w_{i,t} + \sum_{j=1}^{n} z_{j,i,0,t} - \sum_{j=1}^{n} z_{i,j,0,t} \tag{8} \]



In summary, at the beginning of each time slot, we apply 
the optimization (4) at the dispatcher and the tasks are then 
assigned to the data centers. Next, the optimization (7) for the 
data centers is applied globally to determine the assignment 
and migration of previously released unexecuted workload. 
The migration then takes place and the amount of execution 
for each data center is determined by equation (8). After that, 
each of the data centers executes that amount of workload. 

D. Analysis of the Algorithm 

We now analyze the performance of the online algorithm. 
We first prove that there does not exist any online algorithm 
with constant competitive ratio with respect to the offline 
formulation (3). 



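\[ u_{i,s} = \begin{cases} x_{i,s-t,t} & \text{if } s = t + D, \\ w_{i,s} + x_{i,s-t,t} + \sum_{j=1}^{n} z_{j,i,s-t+1,t-1} - \sum_{j=1}^{n} z_{i,j,s-t+1,t-1} & \text{if } t \le s < t + D. \end{cases} \tag{6} \]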
[Optimization (7): objective (7a) with constraints (7b)-(7f). Legible fragments: a migration term summed over $i = 1, \ldots, n$, $j = 1, \ldots, n$, $d = 0, \ldots, D$; the bound $z_{i,j,s-t,t} \le w_{i,s}$; ranges $t \le s \le t + D$ and $\forall i, \forall j, \forall d$.] 



Lemma 3: No online algorithm has constant competitive 
ratio with respect to the offline formulation (3). 

Proof: Prediction error degrades the performance of the 
online algorithm, hence w.l.o.g. we assume that the online 
algorithm does not have any prediction error, i.e., $e = 0$. We 
prove the claim by the adversary method, i.e., we consider an 
adversary who presents the online algorithm with several different 
instances. Suppose we have only one data center ($n = 1$) with 
capacity $M$. The time slots are $\{0, 1, \ldots, T\}$. The uniform 
deadline is $D < T$. The cost function parameters $\beta_t$ are 
$\beta_0 = K \cdot \beta_D$ and $\beta_t \gg K \cdot \beta_D$ for $t \in \{0, 1, \ldots, T\} - \{0, D\}$. 
Now, for determining the assignments $x_{d,t}$, we consider two 
cases: 

Case 1: $\{x_{D,0} \ne 0\}$ 

In this case, suppose the workloads released at time $t = 0$ and 
$t = 1$ are $L_0$ and $L_1$ respectively, with $L_0 = L_1 = M$. Then the 
online assignment vector has $x_{d,1} > 0$ for some $0 \le d \ne D - 1$, 
where $\beta_{d+1} \gg K \cdot \beta_D$, which can be arbitrarily large. Hence 
the competitive ratio becomes unbounded. 

Case 2: $\{x_{D,0} = 0\}$ 

In this case, we construct an adversary input by making $L_0 = M$ 
and $L_1 = 0$. The offline algorithm chooses $x^*_{D,0} = L_0$ but 
the online algorithm chooses $x_{0,0} = L_0$. Since $\beta_0 = K \cdot \beta_D$, 
the competitive ratio for this case is $\approx K$. Since $K$ is arbitrary, 
the competitive ratio is not bounded by a constant. ■ 

Since the performance of any online algorithm cannot 
be bounded with respect to the offline algorithm, we compare 
the online algorithm with a simple online algorithm without 
migration and without any prediction error. We call this 
simple online algorithm $A$ and the online algorithm with 
migration (described in Section III-C2) $A^m$. We denote the 
online algorithm with prediction error but without migration 
(described in Section III-C1) by $A^e$. Our main goal is to 
compare the performance of $A$ and $A^m$. We denote the total 
cost of an algorithm $A$ by $cost(A)$. We have the following 
lemma. 

Lemma 4: $cost(A^m) \le cost(A^e)$. 

Proof: Let $y^m_{i,t}$ and $y^e_{i,t}$ be the workload executed at time 
$t$ by algorithms $A^m$ and $A^e$ respectively. In the algorithm 
$A^m$, the workload assigned to a time slot $t$ can only move 
to an earlier time slot $t' < t$, as illustrated by constraints 
(7b) and (7c). Hence $\sum_{i=1}^{n} y^e_{i,t} \le \sum_{i=1}^{n} y^m_{i,t}$. Therefore 

\[ \Delta y = \sum_{s=t+1}^{t+D} \sum_{i=1}^{n} y^e_{i,s} - \sum_{s=t+1}^{t+D} \sum_{i=1}^{n} y^m_{i,s} \ge 0. \]

That means $A^e$ has $\Delta y$ more workload to execute in later slots 
$s > t$ than $A^m$ and, due to the optimization in $A^m$, 

\[ \sum_{i=1}^{n} C_{i,t}(\Delta y) + \sum_{i=1}^{n} \sum_{j=1}^{n} b_{i,j}(\Delta y) \le \sum_{i=1}^{n} C_{i,s}(\Delta y) \]

for any $t + 1 \le s \le t + D$. Since both algorithms use the 
same energy cost functions, we have $cost(A^m) \le cost(A^e)$. ■ 

According to Lemma 4, incorporating migration into the 
online algorithm does not increase the total cost of execution relative to $A^e$. 
Using this lemma, we now bound the cost of algorithm $A^m$ 
with respect to $A$ by the prediction error $e$, as stated in the 
following theorem. 

Theorem 5: $cost(A^m) \le (1 + e) \cdot cost(A)$. 

Proof: We first show that $cost(A^e) \le (1 + e) \cdot cost(A)$. 
Then by Lemma 4 the theorem holds. If there were no 
prediction error, i.e., $e = 0$, then $cost(A^e) = cost(A)$. If there 
is a prediction error $e > 0$, then suppose the electricity 
price predicted for time $t$ by $A^e$ is $\hat{\beta}$, whereas the actual 
price used in $A$ is $\beta$. Then $\beta - e \le \hat{\beta} \le \beta + e$. Then 

\[ \frac{cost(A^e)}{cost(A)} = \frac{(\beta + e)\, y}{\beta\, y} \le 1 + \frac{e}{\beta} \le (1 + e) \]

where $y$ is the executed workload. ■ 

Suppose the prediction error follows the Gaussian 
distribution with standard deviation $\sigma$. Then the probability that the 
prediction error exceeds $e$ is bounded by Chebyshev's 
inequality: 

\[ \Pr(|\hat{C} - C| \ge e) \le \frac{\sigma^2}{e^2} \]

By Theorem 5, the cost savings from the online algorithm 
depend on the price variation and the quality of prediction. 
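
For a sense of how loose the Chebyshev bound is under the Gaussian assumption, the following sketch compares it with the exact two-sided Gaussian tail computed from the complementary error function. The function names are illustrative; only the Python standard library is used.

```python
import math


def chebyshev_bound(sigma, e):
    """Chebyshev upper bound on P(|error| >= e), capped at 1."""
    return min(1.0, sigma ** 2 / e ** 2)


def gaussian_tail(sigma, e):
    """Exact P(|X| >= e) for a zero-mean Gaussian X with std. deviation sigma."""
    return math.erfc(e / (sigma * math.sqrt(2.0)))


# Error threshold of two standard deviations.
bound = chebyshev_bound(1.0, 2.0)   # 0.25
exact = gaussian_tail(1.0, 2.0)     # roughly 0.0455
```

The bound holds for any distribution with the given variance, which is why it is considerably weaker than the exact Gaussian tail.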

E. Nonuniform Deadline 

The algorithm described above can be easily extended to 
nonuniform deadlines, where the deadline requirement is not the 
same for all the workload. In this case the workload can be 
decomposed according to the associated deadlines. Let 
$L_{d,t} \ge 0$ be the portion of the workload released at time $t$ 
that has deadline $d$, for $0 \le d \le D$, where $D$ is the maximum 
deadline. Then we have 

\[ \sum_{d=0}^{D} L_{d,t} = L_t. \]

Then the constraints for $L_t$ in the offline formulation (3b) and 
the online formulation (4b) can be replaced by the following 
constraint: 

\[ \sum_{i=1}^{n} \sum_{k=0}^{d} x_{i,k,t} \ge \sum_{k=0}^{d} L_{k,t}, \quad 0 \le d \le D. \]

Then the same algorithm can be applied to obtain solutions for 
nonuniform deadlines. 

IV. Experimental Results 

In this section, we seek to evaluate the cost incurred by the 
algorithms $A$, $A^e$ and $A^m$ relative to the optimal solution in 
the context of workload generated from realistic data. 

A. Experimental Setup 

We aim to use realistic parameters in the experimental setup 
and provide conservative estimates of the cost savings resulting 
from optimal geographical load balancing. 

Electricity Price: There are two types of electricity markets: 
the wholesale market and the retail market. Because data centers 
consume large amounts of electricity, they usually purchase 
it from the wholesale markets [14]. Electricity prices 
vary on a 5-minute or 15-minute basis in the real-time wholesale 
electricity market. Electricity prices in this market exhibit 
significant volatility with high-frequency variation [5]. 

We run our simulations for four data centers geographically 
located in four different locations. We choose distant locations 
for our experiments. We choose the locations near those power 
grids whose real time electricity prices are publicly available. 
We used the publicly available data from electricity markets 
from Independent System Operator New England (ISO-NE) 
ifTTl . New York Independent System Operator (NYISO) |[l8l. 
Electric Reliability Council of Texas (ERCOT) |[T9l and 
Electricity Market of New Zealand (NZ) EQ). We took the 
locational based marginal prices (LBMP) from the 5 minute 
spot markets for three days (15th, 14th and 8th February, 
2012) and ran our experiments on the prices of 15th February 
using the prices for 14th and 8th for prediction of future 
prices. We use the four locations to have both temporal and
geographical variation of electricity prices; e.g., the time zones
of New York, New England, Texas and New Zealand are GMT-5,
GMT-5, GMT-6 and GMT+13 respectively. The variation of
electricity prices for the different locations is plotted in Figure 3
against Eastern Standard Time (EST). These graphs indicate
significant spatio-temporal variation in electricity prices.

Fig. 3. Illustration of five-minute locational marginal electricity prices in
the real-time market on 15th February, 2012 for four different regions: (a) New
England (ISO-NE), (b) New York (NYISO), (c) Texas (ERCOT), (d) New
Zealand (NZ). Horizontal axis: time in hours (EST).

Workload Description: We use two publicly available
MapReduce traces as examples of dynamic workload. The
traces were released by Chen et al. [21] and are
produced from real Facebook traces for one day (24 hours)
from a cluster of 600 machines. We count the number of
job submissions of different types over a time slot length of
5 minutes and use that as the dynamic workload (Figure 4) for
simulation. The two samples we use exhibit strong diurnal
properties and range from a typical workload (Workload
A) to a bursty workload (Workload B).
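
As a minimal sketch of how a trace could be turned into the per-slot workload described above, the following Python snippet bins job submission times into 5-minute slots. The input format (submission times in seconds since the start of the trace) and the function name `jobs_per_slot` are assumptions made for illustration; the real trace layout may differ, and the sketch counts all jobs together rather than per job type.

```python
from collections import Counter

SLOT_SECONDS = 5 * 60                          # slot length used in the experiments
SLOTS_PER_DAY = 24 * 60 * 60 // SLOT_SECONDS   # 288 slots in a 24-hour trace


def jobs_per_slot(submit_times):
    """Return the number of jobs submitted in each 5-minute slot of one day.

    submit_times: iterable of submission times in seconds since trace start
    (hypothetical input format). Times outside the 24-hour window are ignored.
    """
    counts = Counter(
        int(t) // SLOT_SECONDS
        for t in submit_times
        if 0 <= t < SLOTS_PER_DAY * SLOT_SECONDS
    )
    return [counts.get(slot, 0) for slot in range(SLOTS_PER_DAY)]


if __name__ == "__main__":
    # Toy trace: two jobs in slot 0, one in slot 1, one in the last slot.
    workload = jobs_per_slot([12.0, 250.5, 301.0, 86399.0])
    print(len(workload), workload[0], workload[1], workload[-1])
```

The resulting list plays the role of the dynamic workload, with one entry per time slot.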

We use a time slot length of 5 minutes because the electricity
prices vary at 5-minute intervals. In practice, load
balancing decisions can be made more frequently, with slot
lengths in the range of seconds. We then assign a deadline to
each job in terms of the number of slots the job can be delayed.
For the case of uniform deadline, we vary the deadline D from
1 to 12 slots in the simulation. This is realistic because deadlines
of 8-30 minutes have been used for MapReduce workloads in the
literature [22], [23], [24]. For the nonuniform case, we use
k-means clustering to classify the MapReduce workload into
10 groups based on the total sizes of map, shuffle and reduce
bytes. The characteristics of each group are shown in Table I.
Smaller jobs dominate the workload mix: smaller jobs
form larger classes and larger jobs form smaller classes. This
kind of clustering has been used by Chen et al. for classifying
the workload. For each class of jobs, we assign a deadline
of 1 to 10 slots such that a smaller class (batch jobs) has a
larger deadline and a larger class (interactive jobs) has a smaller
deadline.
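
The classification step can be illustrated with a small, self-contained sketch: plain Lloyd-style k-means on (map, shuffle, reduce) byte sizes, followed by a deadline assignment in which the class with the most jobs receives the smallest deadline. The deterministic initialization, the toy job sizes and the helper names are assumptions made for this example, not the procedure used in the experiments.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    # Deterministic start (an assumption): spread the initial centers
    # over the points sorted by total size.
    ordered = sorted(points, key=sum)
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each job goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def deadlines_by_class_size(labels, k):
    """Map cluster index -> deadline in slots; the largest class gets 1."""
    sizes = [labels.count(j) for j in range(k)]
    ranked = sorted(range(k), key=lambda j: -sizes[j])
    return {cluster: rank + 1 for rank, cluster in enumerate(ranked)}


if __name__ == "__main__":
    # Toy (map, shuffle, reduce) sizes: three small, two medium, one large job.
    jobs = [(50, 40, 45), (1, 0, 1), (500, 450, 480),
            (2, 1, 1), (55, 42, 50), (1, 1, 1)]
    labels = kmeans(jobs, k=3)
    deadline = deadlines_by_class_size(labels, k=3)
    print([deadline[lab] for lab in labels])
```

On raw byte counts the largest jobs dominate the distances, so a real classification would probably work on a transformed scale (for example logarithmic); that choice is left open here.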

Cost benchmark: Currently geographical load balancing for
data centers typically does not use deferral of workload for
load balancing [?], [?]. Often the load balancing decisions
are made dynamically using a greedy method based on current
electricity prices without dynamic deferral. Clearly, we could
be more energy efficient if we considered deferral of some of the tasks
and used migration to adapt to the variation of electricity
prices. We compare the total cost of the offline and online
($A^{EM}$) algorithms with the greedy strategy proposed by
Qureshi et al. [5] and evaluate the cost reduction. We also
compare the total cost of the online algorithms $A^{E}$ and $A^{EM}$.

Fig. 4. Illustration of the traces for dynamic workload used in the
experiments: (a) Workload A, (b) Workload B.

TABLE I
Cluster Sizes and Deadlines for Workload Classification by
K-Means Clustering for Nonuniform Deadline

            Workload A          Workload B       Deadline
Cluster    #Jobs       GB      #Jobs       GB    (#slots)
   1        4878      0.22      5632      0.32       1
   2         496      3.13       513      5.85       2
   3         196      9.90       170     18.62       3
   4         113     23.49       100     39.09       4
   5          80     49.59       106     56.52       5
   6          49     85.07        44     99.23       6
   7          48    146.67        26    160.90       7
   8          19    286.36        29    350.62       8
   9          13    620.01        11    659.30       9
  10           2   8104.52         7   1294.19      10
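
The greedy benchmark without deferral can be sketched as follows: in every slot, the arriving workload is sent to the data center with the lowest current price until its capacity is reached, then to the next cheapest, and so on. This is a simplified illustration under the assumptions that the cost is linear in the load and that the workload never exceeds the total capacity; it is not the exact heuristic of Qureshi et al., and the function names are hypothetical.

```python
def greedy_assign(load, prices, capacity):
    """Assign one slot's workload to the cheapest data centers first.

    prices[i] is the current electricity price at data center i and
    capacity[i] its maximum load. Returns (assignment, cost), where the
    cost is assumed to be price times assigned load.
    """
    if load > sum(capacity):
        raise ValueError("workload exceeds total capacity; greedy cannot defer")
    assignment = [0] * len(prices)
    remaining = load
    # Visit data centers from cheapest to most expensive.
    for i in sorted(range(len(prices)), key=lambda i: prices[i]):
        assignment[i] = min(remaining, capacity[i])
        remaining -= assignment[i]
        if remaining == 0:
            break
    cost = sum(p * x for p, x in zip(prices, assignment))
    return assignment, cost


def greedy_total_cost(workload, price_trace, capacity):
    """Total greedy cost; price_trace[t][i] is the price at slot t, site i."""
    return sum(greedy_assign(load, prices, capacity)[1]
               for load, prices in zip(workload, price_trace))


if __name__ == "__main__":
    # Four data centers with capacity 50 each, 120 units arriving in one slot.
    print(greedy_assign(120, [30.0, 25.0, 40.0, 35.0], [50] * 4))
```

The total cost of such a baseline is what the cost reductions of the offline and online algorithms are measured against.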

Cost function parameters: The cost function parameter $\beta_{i,t}$
is determined using the current electricity price, and $\beta_{i,t'}$ for
$t' > t$ is determined using the electricity price prediction
models described in Section IIIA. For our simulations, we use
the load-independent parameter $a_i = 0$ for all $i$. The values of
$b_{i,j}$ are set proportional to the geographic distance
between data centers $i$ and $j$. Since workload cannot be
migrated from a data center to itself, we set $b_{i,i}$ to a large
number. Depending on the nature of the workload, we varied
the total capacity of the data centers, because the algorithms
keep assigning workload to the data center with the
lowest cost until that data center is overloaded. Choosing a
maximum capacity below the peak workload allows
us to visualize the cutoff for the assignment. For both
workloads, we use capacity $M_i = 50$ for all $i$.
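
One possible way to build migration cost parameters proportional to geographic distance is sketched below, using the haversine great-circle distance. The site coordinates, the proportionality constant `cost_per_km` and the value chosen for the "large number" on the diagonal are all illustrative assumptions; the text does not specify them.

```python
import math

EARTH_RADIUS_KM = 6371.0
NO_SELF_MIGRATION = 1e9   # stands in for the "large number"; value is an assumption


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def migration_costs(sites, cost_per_km=1.0):
    """Return the matrix b[i][j], proportional to the distance between sites."""
    n = len(sites)
    return [[NO_SELF_MIGRATION if i == j
             else cost_per_km * haversine_km(sites[i], sites[j])
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Approximate, purely illustrative coordinates for four regions.
    sites = [(42.36, -71.06),    # New England
             (40.71, -74.01),    # New York
             (30.27, -97.74),    # Texas
             (-41.29, 174.78)]   # New Zealand
    for row in migration_costs(sites):
        print([round(x) for x in row])
```

Setting the diagonal to a very large value simply makes migrating to the same data center unattractive to any cost-minimizing algorithm.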

The future electricity prices $\beta_{i,t}$ for the next D time slots are
randomly generated from Gaussian distributions because of
their high unpredictability and the volume (D) of generation, as
described in Section IIIA. We use the same mean but different
variances for the generation in each time slot. We use the
optimal daily coefficients for the price prediction filter from
[16] for estimating $\sigma_i^t[x]$. Since we use the electricity prices
for Wednesday (15th February, 2012), we choose $k_1 = 0.837$,
$k_2 =$ [illegible] and $k_7 = 0.142$. For the previous standard deviation
values ($\sigma_i^t[x-1]$, $\sigma_i^t[x-7]$), we use the past standard deviation
of electricity prices for D slots on those days, such that

$$\sigma_i^t[x-1] := \mathrm{std}(\beta_{i,t}[x-1], \beta_{i,t-1}[x-1], \ldots, \beta_{i,t-D}[x-1])$$

and

$$\sigma_i^t[x-7] := \mathrm{std}(\beta_{i,t}[x-7], \beta_{i,t-1}[x-7], \ldots, \beta_{i,t-D}[x-7]),$$

where std(·) denotes the standard deviation. The mean $\mu_i^t[x]$
is computed from the moving average of the prices for D
previous slots on the current day $x$.

Fig. 5. Impact of deadline on cost reduction by the offline and the online
algorithm $A^{EM}$ in comparison to the greedy algorithm: (a) Workload A, (b) Workload B.

Fig. 6. Comparison of cost incurred by the online algorithms $A^{E}$ and $A^{EM}$
(AE and AEM in figure) for different deadlines: (a) Workload A, (b) Workload B.
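
A rough sketch of the price generation step for one data center is given below. It assumes that the standard deviation for day x is a weighted combination $k_1 \sigma[x-1] + k_7 \sigma[x-7]$ of the windowed standard deviations from the previous day and the same day of the previous week, and that the window for a future slot ends at that slot; both points are interpretations made for illustration, as are the function names. The default coefficients are the Wednesday values quoted in the text.

```python
import random
import statistics


def window(prices, end, D):
    """Prices for slots end-D .. end (inclusive), clipped at the start of day."""
    return prices[max(0, end - D):end + 1]


def predict_prices(today, yesterday, last_week, t, D,
                   k1=0.837, k7=0.142, rng=None):
    """Draw D future prices for slots t+1 .. t+D from Gaussian distributions.

    today, yesterday, last_week: per-slot price lists for day x, x-1 and x-7.
    The same mean (moving average over the current day) is used for every
    future slot, while the standard deviation differs from slot to slot.
    """
    rng = rng or random.Random()
    mean = statistics.fmean(window(today, t, D))
    predictions = []
    for j in range(1, D + 1):
        end = min(t + j, len(yesterday) - 1, len(last_week) - 1)
        sigma = (k1 * statistics.pstdev(window(yesterday, end, D))
                 + k7 * statistics.pstdev(window(last_week, end, D)))
        predictions.append(rng.gauss(mean, sigma))
    return predictions


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    today = [30.0, 32.0, 31.0, 35.0, 40.0, 38.0]
    yesterday = [28.0, 29.0, 33.0, 36.0, 37.0, 35.0, 30.0, 29.0, 31.0, 34.0]
    last_week = [25.0, 27.0, 30.0, 31.0, 45.0, 33.0, 29.0, 28.0, 27.0, 30.0]
    print(predict_prices(today, yesterday, last_week, t=5, D=3, rng=rng))
```

When the historical windows are flat, the estimated standard deviation is zero and every generated price equals the moving-average mean.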

B. Experimental Analysis 

We now evaluate and analyze the cost savings provided by 
the offline and online algorithms. 

Uniform Deadline: We compare the cost reduction of the
offline and the online algorithm $A^{EM}$ against the greedy method
without dynamic deferral. Figure 5 depicts the cost reduction
of the online and offline algorithms for different deadlines.
These curves show that dynamic deferral can provide around
30% cost savings for deadlines of 12 slots (1 hour), and even
for one slot we can get about 5% cost savings. Figure 6 compares
the total cost of algorithms $A^{E}$ and $A^{EM}$
for different deadlines. From this figure, we can see that the
total cost of algorithm $A^{EM}$ is always less than the
total cost of algorithm $A^{E}$, as claimed in Lemma 4.
As the deadline increases, the total cost of algorithm $A^{E}$
increases, since the prediction error becomes significant
when predicting more distant values (electricity prices), while
the total cost of $A^{EM}$ decreases due to migration and the
flexibility of dynamic deferral.

Nonuniform Deadline: We evaluate the cost savings for
nonuniform deadlines by assigning different deadlines to the
workload classes shown in Table I. For conservative
estimates of deadline requirements (1-10 slots), we found a 15.64%
cost reduction for Workload A and a 9.23% cost reduction for
Workload B, each of which remains close to the offline optimal
solution.

V. Related Work 

Greening data centers is becoming an increasingly important
topic in operating cloud-scale data centers for two
main reasons: (1) the global energy crisis and environmental
concerns (e.g. global warming) [?] and (2) increasing energy
consumption in data centers [?]. We now discuss the related
work.

Energy Management in data centers. Given the importance
of energy management in data centers, many researchers have
applied energy-aware scheduling because of its low cost and
practical applicability. Beloglazov et al. [2] give a taxonomy
and survey of energy management in data centers. Lin et
al. [12] minimize the energy cost together with the
delay cost by right-sizing data centers. Unlike their work, we
focus on distributing requests among data centers in different
locations, assuming fixed capacity for each data center.

Geographical Load Balancing. The research community
has recently identified the potential of reducing the operating
cost of data centers by geographical load balancing based on
the spatio-temporal variation in electricity prices. Qureshi et
al. [5] studied the problem of reducing the electricity cost
in a wholesale market environment. They try to lower the
electricity bill by exploiting the varying electricity prices at
the different locations of distributed data centers. They describe
greedy heuristics and evaluate them on historical electricity
prices and network traffic data, but they do not consider
migration of workload or SLA requirements. In this paper,
we utilize SLA information for load balancing and compare
our algorithms with their proposed greedy algorithm. Rao et al.
[?] consider load balancing of delay-sensitive applications with
the objective of minimizing the current energy cost subject
to delay constraints in a multi-electricity-market environment.
They use queuing theory to restrict the average
latency rather than utilizing the flexibility from the SLAs
and dynamic migration of workload. Buchbinder et al. [?]
presented online algorithms for migrating jobs between data
centers, which handle the fundamental tradeoff between energy
and bandwidth costs. For constant workload they give a
bounded competitive ratio, but for varying workload they
present a heuristic algorithm to reduce the computational
complexity, without making any probabilistic assumption about
the future workload and future electricity prices. In contrast,
we use the deadline requirements and probabilistic
assumptions to make scheduling decisions. There has also been
some work on utilizing renewable energy for energy efficiency
in data centers. Liu et al. [?] presented a formulation for
geographical load balancing without deadlines and investigated
how renewable energy can be used to lower the electricity
price of brown energy. In contrast, we consider migration
of workload between data centers to exploit energy price
variation via dynamic deferral. Le et al. [14] propose to cap
the consumption of brown energy while maintaining service
level agreements (SLAs). Unlike their method for satisfying
SLAs, we utilize the flexibility from the SLAs for reducing
energy consumption. Stewart et al. [10] try to maximize the
use of renewable energy in data centers. However, they assume
that data centers have their own energy sources (solar plants,
wind mills, etc.). In contrast, we consider
the case where cloud service providers buy energy
from wholesale markets, which is the more common case for
many data centers.

Scheduling with deadline. Many real-world applications
require a delay bound or deadline constraint; e.g., see Lee et al.
[25]. When combined with energy conservation, the deadline is
usually a critical tool for trading off performance loss against
energy consumption. Energy-efficient deadline scheduling was
first studied by Yao et al. [26]. They proposed algorithms
that aim to minimize energy consumption for independent
jobs with deadline constraints on a single variable-speed
processor. After that, a series of works considered online
deadline scheduling in different scenarios, such as discrete-voltage
processors, tree-structured tasks, processors with a sleep
state and overloaded systems [27], [28]. In the context of data
centers, most work on energy management merely discusses
minimizing the average delay without giving any bound on delay,
except Mukherjee et al. [29]. They proposed online algorithms
that consider deadline constraints to minimize the computation,
cooling and migration energy for machines. However, their
work addresses job assignment inside one data center without
electricity price variation.

Models for electricity price prediction. In [30], Gonzalez
et al. presented a taxonomy of electricity price prediction
models. According to them, the electricity price in a wholesale market is
not easy to predict due to the uncertainty of exogenous variables
(e.g. energy demand, water inflow, availability of generation
units, fuel costs). Researchers have tried predicting electricity
prices using the autoregressive integrated moving average
(ARIMA) model [31], the generalized autoregressive conditional
heteroskedasticity (GARCH) model [32], the wavelet transform
[33], and dynamic regression and transfer function models [34].
Artificial intelligence methods that are also suitable for price
forecasting include artificial neural networks (ANN) [35], support
vector machines (SVM) [36] and the Input-Output Hidden Markov
Model (IOHMM) [30]. However, all of these models are based
on time series, which are useful for predicting a single value.
In our algorithms we need to predict D future values, for which
the time series methods do not perform well. Hence we use
Gaussian distributions to generate future values, where the
mean is predicted by a time series method and the variance is
estimated from previous history. Mohsenian and Garcia [16]
recently proposed a simple and efficient weighted average
price prediction filter to predict electricity prices based on the
prices from the previous day and the same day in the previous week.
In this paper, we use this model to estimate the variance because
of its low computational complexity.

VI. Conclusion 

In this paper we have proposed online algorithms for
geographical load balancing in data centers while guaranteeing
the deadlines. The algorithms utilize the latency requirements
of workloads and exploit the electricity price variation
for cost savings, and they guarantee bounded cost and bounded
latency under very general settings: arbitrary workload,
general deadlines and general energy cost models. Further,
the online algorithms are simple to implement and do not
require significant computational overhead. To the best of our
knowledge, this is the first formulation for load balancing with
deadlines that utilizes the slackness in the execution of jobs for
energy savings.

Our experiments highlight that significant cost and energy
savings can be achieved via dynamic deferral of workload.
However, the performance of the online algorithms depends
on the price variation and the quality of prediction. In this
paper, we limited our scope to the cloud,
considering data centers as computation units. Other factors
such as capacity provisioning, heterogeneity and availability of
renewable energy could be taken into account in load
balancing decisions. We would like to consider these issues
together with load balancing in future work. It would also be
interesting to carry out a probabilistic analysis of cost savings
in a demand-response market.